Boston is a dataset with 506 observations and 14 variables describing housing values in the suburbs of Boston. A correlation plot matrix and summaries of the variables can be found below.
We observe moderate to strong correlations (ca. 0.6 in absolute value) between the variables rad and crim, tax and crim, age and zn, dis and zn, nox and indus, age and indus, dis and indus, rad and indus, tax and indus, lstat and indus, age and nox, dis and nox, rad and nox, tax and nox, lstat and nox, lstat and rm, medv and rm, dis and age, lstat and age, tax and rad, lstat and medv. Apart from rm, which appears to be approximately normally distributed, none of the variables follows a normal distribution.
| | mean | sd | median | se | IQR |
|---|---|---|---|---|---|
| crim | 3.6 | 8.6 | 0.3 | 0.4 | 3.6 |
| zn | 11.4 | 23.3 | 0.0 | 1.0 | 12.5 |
| indus | 11.1 | 6.9 | 9.7 | 0.3 | 12.9 |
| chas | 0.1 | 0.3 | 0.0 | 0.0 | 0.0 |
| nox | 0.6 | 0.1 | 0.5 | 0.0 | 0.2 |
| rm | 6.3 | 0.7 | 6.2 | 0.0 | 0.7 |
| age | 68.6 | 28.1 | 77.5 | 1.3 | 49.0 |
| dis | 3.8 | 2.1 | 3.2 | 0.1 | 3.1 |
| rad | 9.5 | 8.7 | 5.0 | 0.4 | 20.0 |
| tax | 408.2 | 168.5 | 330.0 | 7.5 | 387.0 |
| ptratio | 18.5 | 2.2 | 19.1 | 0.1 | 2.8 |
| black | 356.7 | 91.3 | 391.4 | 4.1 | 20.8 |
| lstat | 12.7 | 7.1 | 11.4 | 0.3 | 10.0 |
| medv | 22.5 | 9.2 | 21.2 | 0.4 | 8.0 |
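Because the variables are measured on very different scales (compare, for example, the standard deviations of nox and tax), we standardised the data so that each variable has mean 0 and standard deviation 1.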
| | mean | sd | median | se | IQR |
|---|---|---|---|---|---|
| crim | 0 | 1 | -0.4 | 0 | 0.4 |
| zn | 0 | 1 | -0.5 | 0 | 0.5 |
| indus | 0 | 1 | -0.2 | 0 | 1.9 |
| chas | 0 | 1 | -0.3 | 0 | 0.0 |
| nox | 0 | 1 | -0.1 | 0 | 1.5 |
| rm | 0 | 1 | -0.1 | 0 | 1.1 |
| age | 0 | 1 | 0.3 | 0 | 1.7 |
| dis | 0 | 1 | -0.3 | 0 | 1.5 |
| rad | 0 | 1 | -0.5 | 0 | 2.3 |
| tax | 0 | 1 | -0.5 | 0 | 2.3 |
| ptratio | 0 | 1 | 0.3 | 0 | 1.3 |
| black | 0 | 1 | 0.4 | 0 | 0.2 |
| lstat | 0 | 1 | -0.2 | 0 | 1.4 |
| medv | 0 | 1 | -0.1 | 0 | 0.9 |
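The standardisation behind the table above can be illustrated with a minimal Python sketch. It assumes the usual z-score (subtract the mean, divide by the sample standard deviation); the helper names `standardise` and `scaled_median` and the toy values are hypothetical and not part of the original analysis. The second part cross-checks a few standardised medians against the rounded values from the first summary table.

```python
import statistics


def standardise(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample sd, n - 1 denominator
    return [(v - mean) / sd for v in values]


def scaled_median(mean, sd, median):
    """Median of a standardised variable, computed from its raw summaries."""
    return (median - mean) / sd


# Hypothetical toy values, not taken from the Boston data.
toy = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardise(toy)

# (mean, sd, median) as rounded in the first summary table.
raw_summary = {
    "crim": (3.6, 8.6, 0.3),
    "age": (68.6, 28.1, 77.5),
    "tax": (408.2, 168.5, 330.0),
}
for name, (mean, sd, median) in raw_summary.items():
    print(name, round(scaled_median(mean, sd, median), 1))
```

Because standardisation is a linear transformation, the median of each scaled variable equals the scaled raw median, which is why the values printed here agree with the median column of the standardised table.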
We divided the standardised Boston data set into a training and a test set, with 80% of the data assigned to the training set, and fitted a linear discriminant analysis (LDA) with the categorical target variable crime and its four classes.
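The data preparation described above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the code used for the analysis: it assumes that the crime classes are quartile bins of the standardised crime rate and that the split is a simple random one. The function names `quartile_classes` and `train_test_split`, the seed, and the toy input are hypothetical.

```python
import random
import statistics


def quartile_classes(values):
    """Assign each value to class 0..3 using quartile cut points.

    Assumption: the four crime classes are quartile bins with
    right-closed intervals.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

    def label(v):
        if v <= q1:
            return 0
        if v <= q2:
            return 1
        if v <= q3:
            return 2
        return 3

    return [label(v) for v in values]


def train_test_split(n_rows, train_fraction=0.8, seed=0):
    """Shuffle the row indices and cut them into a training and a test part."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_train = int(n_rows * train_fraction)
    return indices[:n_train], indices[n_train:]


# Hypothetical skewed toy values standing in for the standardised crime rate.
rng = random.Random(1)
toy_crime = [rng.expovariate(1.0) for _ in range(506)]
classes = quartile_classes(toy_crime)

train_idx, test_idx = train_test_split(506)
print(len(train_idx), len(test_idx))  # 404 102
```

With 506 rows, an 80% training share leaves 102 rows for testing, which matches the total number of observations in the cross-tabulation further below.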
We plotted this LDA model in the following biplot.
| | [-0.419,-0.411] | (-0.411,-0.39] | (-0.39,0.00739] | (0.00739,9.92] |
|---|---|---|---|---|
| [-0.419,-0.411] | 16 | 17 | 1 | 0 |
| (-0.411,-0.39] | 2 | 18 | 6 | 0 |
| (-0.39,0.00739] | 0 | 8 | 10 | 2 |
| (0.00739,9.92] | 0 | 0 | 0 | 22 |
As the biplot shows, the predictor variable rad (accessibility to radial highways) predicts a high per capita crime rate (blue). In contrast, a high proportion of residential land zoned for lots over 25,000 sq.ft. (variable zn) predicts a low per capita crime rate. The two middle crime classes, depicted in red and green, overlap and are predicted by multiple predictors.
We predicted the classes of the test set with the LDA model and observed that the model predicts crime rates above the mean (i.e. the highest crime class) very well, but fails to distinguish the lower crime classes effectively.
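This observation can be quantified directly from the cross-tabulation above. The following Python sketch copies the counts from that table and computes the overall accuracy and the share of correctly classified observations per row. It assumes that the rows hold the observed classes and the columns the predicted classes; the table itself does not state its orientation.

```python
# Counts copied from the cross-tabulation, classes ordered from the
# lowest to the highest crime rate.
# Assumption: rows are observed classes, columns are predicted classes.
table = [
    [16, 17, 1, 0],
    [2, 18, 6, 0],
    [0, 8, 10, 2],
    [0, 0, 0, 22],
]

total = sum(sum(row) for row in table)
correct = sum(table[i][i] for i in range(len(table)))
accuracy = correct / total

# Share of each row that lies on the diagonal (per-class recall
# under the orientation assumed above).
per_class = [table[i][i] / sum(table[i]) for i in range(len(table))]

print(total, correct, round(accuracy, 3))  # 102 66 0.647
print([round(r, 2) for r in per_class])    # [0.47, 0.69, 0.5, 1.0]
```

The overall accuracy does not depend on the orientation of the table. Under the stated assumption, every observation in the highest crime class is classified correctly, whereas the three lower classes reach only about 47% to 69%.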